6_ Food Delivery Data Analysis¶

Import Libraries¶

Let's import the libraries needed for the analysis and then load our dataset.

In [1]:
import pandas as pd 

About Data¶

This time you will start from the very beginning by downloading the data from Kaggle and exploring it on your own: https://www.kaggle.com/benroshan/online-food-delivery-preferencesbangalore-region

In [2]:
df = pd.read_csv('onlinedeliverydata.csv')
df.head()
Out[2]:
Age Gender Marital Status Occupation Monthly Income Educational Qualifications Family size latitude longitude Pin code ... Less Delivery time High Quality of package Number of calls Politeness Freshness Temperature Good Taste Good Quantity Output Reviews
0 20 Female Single Student No Income Post Graduate 4 12.9766 77.5993 560001 ... Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Yes Nil\n
1 24 Female Single Student Below Rs.10000 Graduate 3 12.9770 77.5773 560009 ... Very Important Very Important Very Important Very Important Very Important Very Important Very Important Very Important Yes Nil
2 22 Male Single Student Below Rs.10000 Post Graduate 3 12.9551 77.6593 560017 ... Important Very Important Moderately Important Very Important Very Important Important Very Important Moderately Important Yes Many a times payment gateways are an issue, so...
3 22 Female Single Student No Income Graduate 6 12.9473 77.5616 560019 ... Very Important Important Moderately Important Very Important Very Important Very Important Very Important Important Yes nil
4 22 Male Single Student Below Rs.10000 Post Graduate 4 12.9850 77.5533 560010 ... Important Important Moderately Important Important Important Important Very Important Very Important Yes NIL

5 rows × 55 columns

If you want to show all columns, just change the pandas display options!

In [3]:
pd.set_option('display.max_columns', 100)  # display up to 100 columns; the same works for rows
# pd.set_option('display.max_rows', 100)
df.head(5)
Out[3]:
Age Gender Marital Status Occupation Monthly Income Educational Qualifications Family size latitude longitude Pin code Medium (P1) Medium (P2) Meal(P1) Meal(P2) Perference(P1) Perference(P2) Ease and convenient Time saving More restaurant choices Easy Payment option More Offers and Discount Good Food quality Good Tracking system Self Cooking Health Concern Late Delivery Poor Hygiene Bad past experience Unavailability Unaffordable Long delivery time Delay of delivery person getting assigned Delay of delivery person picking up food Wrong order delivered Missing item Order placed by mistake Influence of time Order Time Maximum wait time Residence in busy location Google Maps Accuracy Good Road Condition Low quantity low time Delivery person ability Influence of rating Less Delivery time High Quality of package Number of calls Politeness Freshness Temperature Good Taste Good Quantity Output Reviews
0 20 Female Single Student No Income Post Graduate 4 12.9766 77.5993 560001 Food delivery apps Web browser Breakfast Lunch Non Veg foods (Lunch / Dinner) Bakery items (snacks) Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Agree Agree Agree Agree Agree Agree Yes Weekend (Sat & Sun) 30 minutes Agree Neutral Neutral Neutral Neutral Yes Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Yes Nil\n
1 24 Female Single Student Below Rs.10000 Graduate 3 12.9770 77.5773 560009 Food delivery apps Web browser Snacks Dinner Non Veg foods (Lunch / Dinner) Veg foods (Breakfast / Lunch / Dinner) Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Neutral Agree Strongly agree Strongly agree Agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Yes Anytime (Mon-Sun) 30 minutes Strongly Agree Neutral Disagree Strongly disagree Agree Yes Very Important Very Important Very Important Very Important Very Important Very Important Very Important Very Important Yes Nil
2 22 Male Single Student Below Rs.10000 Post Graduate 3 12.9551 77.6593 560017 Food delivery apps Direct call Lunch Snacks Non Veg foods (Lunch / Dinner) Ice cream / Cool drinks Strongly agree Strongly agree Strongly agree Neutral Neutral Disagree Neutral Disagree Neutral Neutral Agree Agree Agree Agree Agree Agree Agree Strongly agree Agree Neutral Yes Anytime (Mon-Sun) 45 minutes Agree Strongly Agree Neutral Neutral Agree Yes Important Very Important Moderately Important Very Important Very Important Important Very Important Moderately Important Yes Many a times payment gateways are an issue, so...
3 22 Female Single Student No Income Graduate 6 12.9473 77.5616 560019 Food delivery apps Walk-in Snacks Dinner Veg foods (Breakfast / Lunch / Dinner) Bakery items (snacks) Agree Agree Strongly agree Agree Strongly agree Agree Agree Agree Strongly agree Neutral Agree Disagree Disagree Neutral Agree Agree Agree Disagree Disagree Neutral Yes Anytime (Mon-Sun) 30 minutes Disagree Agree Agree Neutral Agree Yes Very Important Important Moderately Important Very Important Very Important Very Important Very Important Important Yes nil
4 22 Male Single Student Below Rs.10000 Post Graduate 4 12.9850 77.5533 560010 Walk-in Direct call Lunch Dinner Non Veg foods (Lunch / Dinner) Veg foods (Breakfast / Lunch / Dinner) Agree Agree Agree Agree Agree Neutral Neutral Agree Strongly agree Strongly agree Agree Strongly agree Agree Disagree Strongly agree Strongly agree Neutral Neutral Neutral Disagree Yes Weekend (Sat & Sun) 30 minutes Agree Agree Agree Agree Agree Yes Important Important Moderately Important Important Important Important Very Important Very Important Yes NIL
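
The display setting can also be changed only temporarily. The sketch below uses `pd.option_context`, which restores the previous value when the block ends; the `wide` frame is a made-up stand-in for `df`, used only so the example runs on its own.

```python
import pandas as pd

# Hypothetical one-row frame with 30 columns, standing in for the survey data.
wide = pd.DataFrame([range(30)], columns=[f"col_{i}" for i in range(30)])

before = pd.get_option("display.max_columns")

# Inside the block no column limit applies; the old value returns afterwards.
with pd.option_context("display.max_columns", None):
    inside = pd.get_option("display.max_columns")
    print(wide)

after = pd.get_option("display.max_columns")
```

This keeps the rest of the notebook on the default display width, which may be preferable when only one cell needs the full table.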

Insights

This data can be split into 8 groups.

  • Group 1 : Consumer demographics (Age, Gender, Marital Status, Occupation, Monthly Income, Educational Qualifications, Family size)
  • Group 2 : Locations (Latitude, Longitude, Pin code)
  • Group 3 : Preferences (Medium P1 & P2, Meal P1 & P2, Cuisine P1 & P2, i.e. the Perference(P1) and Perference(P2) columns)
  • Group 4 : Satisfaction (Ease and convenience, Time saving, More restaurant choices, Easy payment option, More offers and discounts, Good food quality, Good tracking system)
  • Group 5 : Not-purchasing concerns (Self cooking, Health concern, Late delivery, Poor hygiene, Bad past experience, Unavailability, Unaffordable)
  • Group 6 : Cancellation concerns (Long delivery time, Delay of delivery person getting assigned, Delay of delivery person picking up food, Wrong order delivered, Missing item, Order placed by mistake)
  • Group 7 : Preferences (Influence of time, Order time, Maximum wait time, Residence in busy location, Google Maps accuracy, Good road condition, Low quantity low time, Delivery person ability)
  • Group 8 : Loyalty (Influence of rating, Less delivery time, High quality of package, Number of calls, Politeness, Freshness, Temperature, Good taste, Good quantity, Output 'will purchase again', Reviews)
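
One way to keep such a grouping usable in code is a dictionary from group name to column names, plus a small check that every listed column really exists. This is only a sketch: the dictionary keys, the helper `missing_columns` and the `demo` frame are illustrative names, and only three of the eight groups are spelled out.

```python
import pandas as pd

# Column groups, spelled as in the CSV header (only three groups shown here).
groups = {
    "demographics": ["Age", "Gender", "Marital Status", "Occupation",
                     "Monthly Income", "Educational Qualifications", "Family size"],
    "location": ["latitude", "longitude", "Pin code"],
    "preferences": ["Medium (P1)", "Medium (P2)", "Meal(P1)", "Meal(P2)",
                    "Perference(P1)", "Perference(P2)"],
}

def missing_columns(frame, groups):
    """Return {group: [names not found in frame.columns]} for incomplete groups."""
    present = set(frame.columns)
    return {name: [c for c in cols if c not in present]
            for name, cols in groups.items()
            if any(c not in present for c in cols)}

# Stand-in for df: an empty frame that deliberately lacks the "Pin code" column.
demo = pd.DataFrame(columns=groups["demographics"] + ["latitude", "longitude"]
                    + groups["preferences"])
report = missing_columns(demo, groups)
print(report)  # {'location': ['Pin code']}
```

A check like this catches misspelled or space-padded column names before they surface as a `KeyError` in a plotting loop.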
In [4]:
df.describe()
Out[4]:
Age Family size latitude longitude Pin code
count 388.000000 388.000000 388.000000 388.000000 388.000000
mean 24.628866 3.280928 12.972058 77.600160 560040.113402
std 2.975593 1.351025 0.044489 0.051354 31.399609
min 18.000000 1.000000 12.865200 77.484200 560001.000000
25% 23.000000 2.000000 12.936900 77.565275 560010.750000
50% 24.000000 3.000000 12.977000 77.592100 560033.500000
75% 26.000000 4.000000 12.997025 77.630900 560068.000000
max 33.000000 6.000000 13.102000 77.758200 560109.000000

Insights:

  • There are 388 records.
  • Consumer age varies between 18 and 33, with an average of 24.6.
  • Family size of respondents varies between 1 and 6, with a median of 3.
In [5]:
import numpy as np

# Note:
#   pd.pivot_table(df, index=<categorical columns>,
#                  values=<numerical columns>, aggfunc=<list of aggregation functions>)

delivery_pivot1=pd.pivot_table(df,index=["Gender","Marital Status"],
                               values=['Age','Family size'],
                               aggfunc=[np.mean,len], margins=True)
delivery_pivot1
Out[5]:
mean len
Age Family size Age Family size
Gender Marital Status
Female Married 27.265306 3.653061 49 49
Prefer not to say 26.000000 4.000000 5 5
Single 23.098214 3.276786 112 112
Male Married 28.016949 3.864407 59 59
Prefer not to say 27.285714 1.571429 7 7
Single 23.455128 3.000000 156 156
All 24.628866 3.280928 388 388
In [6]:
delivery_pivot2=pd.pivot_table(df,index=["Gender"],
                               values=['Age','Family size'],
                               aggfunc=[np.median,len], margins=True)
delivery_pivot2
Out[6]:
median len
Age Family size Age Family size
Gender
Female 24.0 3 166 166
Male 24.5 3 222 222
All 24.0 3 388 388

Insights:

  • Average age is about 23.1 for single females and 23.5 for single males, compared with 27.3 and 28.0 for married females and males respectively.
  • Average family size is 3.4 for female respondents and 3.2 for male respondents.
  • Median family size is 3 for both female and male respondents.
  • There are more male (222) than female (166) respondents in our data records.
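
The same kind of summary can be produced with aggregation names given as strings, which avoids importing NumPy just for `np.mean` or `np.median`. A minimal sketch on a made-up five-row frame (`toy` is not the survey data):

```python
import pandas as pd

# Hypothetical miniature of the survey, only to keep the example self-contained.
toy = pd.DataFrame({
    "Gender": ["Female", "Female", "Male", "Male", "Male"],
    "Age": [20, 24, 22, 22, 26],
    "Family size": [4, 3, 3, 6, 4],
})

summary = pd.pivot_table(
    toy,
    index="Gender",
    values=["Age", "Family size"],
    aggfunc=["mean", "median", "count"],  # string names instead of np.mean / len
    margins=True,                         # adds the "All" row
)
print(summary.round(2))
```

The result has two column levels (aggregation, then variable), so a single cell is read with a tuple, for example `summary.loc["Female", ("mean", "Age")]`.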
In [7]:
delivery_pivot3=pd.pivot_table(df,index=["Occupation","Monthly Income"],
                               values=['Age','Family size'],
                               aggfunc=[np.mean, np.median,len], margins=True)
delivery_pivot3
Out[7]:
mean median len
Age Family size Age Family size Age Family size
Occupation Monthly Income
Employee 10001 to 25000 24.304348 3.086957 24 3 23 23
25001 to 50000 26.750000 3.134615 27 3 52 52
Below Rs.10000 26.250000 2.625000 26 2 8 8
More than 50000 27.885714 3.771429 27 4 35 35
House wife No Income 30.333333 4.777778 31 5 9 9
Self Employeed 10001 to 25000 26.066667 3.266667 26 3 15 15
25001 to 50000 26.000000 3.857143 26 3 14 14
More than 50000 26.800000 3.480000 26 3 25 25
Student 10001 to 25000 23.714286 3.571429 24 3 7 7
25001 to 50000 23.000000 2.333333 23 3 3 3
Below Rs.10000 22.764706 3.058824 22 3 17 17
More than 50000 23.000000 3.000000 23 3 2 2
No Income 22.775281 3.162921 23 3 178 178
All 24.628866 3.280928 24 3 388 388
In [8]:
delivery_pivot4=pd.pivot_table(df,index=["Monthly Income"],
                               values=['Age','Family size'],
                               aggfunc=[np.mean, np.median,len], margins=True)
delivery_pivot4
Out[8]:
mean median len
Age Family size Age Family size Age Family size
Monthly Income
10001 to 25000 24.800000 3.222222 25 3 45 45
25001 to 50000 26.434783 3.246377 26 3 69 69
Below Rs.10000 23.880000 2.920000 23 3 25 25
More than 50000 27.290323 3.629032 27 3 62 62
No Income 23.139037 3.240642 23 3 187 187
All 24.628866 3.280928 24 3 388 388

Insights:

  • Most students have no income (178 of 207).
  • Housewives have the highest average family size (4.8).
  • Most customers have a family size of 3, with average age .....
  • The largest income group is No Income (187 of 388), with an average age of 23.1.
  • Most customers are students (207 of 388), with an average age of about 23.
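
Statements of the form "most X are Y" can be checked directly with boolean masks, since the mean of a boolean Series is the fraction of `True` values. The frame below is invented for illustration; with the real data the same lines would run on `df`.

```python
import pandas as pd

# Hypothetical six-row sample using the survey's column names.
toy = pd.DataFrame({
    "Occupation": ["Student", "Student", "Student", "Student", "Employee", "Employee"],
    "Monthly Income": ["No Income", "No Income", "No Income", "Below Rs.10000",
                       "25001 to 50000", "More than 50000"],
    "Age": [22, 23, 21, 22, 27, 28],
})

is_student = toy["Occupation"] == "Student"
no_income = toy["Monthly Income"] == "No Income"

# Fraction of all rows that are students.
share_students = is_student.mean()
# Fraction of students (only) that report no income.
share_no_income_among_students = no_income[is_student].mean()
# Average age restricted to students.
mean_age_students = toy.loc[is_student, "Age"].mean()

print(share_students, share_no_income_among_students, mean_age_students)
```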
In [9]:
delivery_pivot3_=pd.pivot_table(df,index=["Educational Qualifications","Occupation"],
                               values=['Age','Family size'],
                               aggfunc=[np.mean,len])

#Adding bar for numbers
delivery_pivot3_.style.bar()
Out[9]:
    mean len
    Age Family size Age Family size
Educational Qualifications Occupation        
Graduate Employee 26.514706 3.250000 68 68
House wife 31.666667 4.333333 3 3
Self Employeed 26.655172 3.862069 29 29
Student 21.987013 3.038961 77 77
Ph.D Employee 28.000000 4.083333 12 12
Self Employeed 24.666667 3.000000 3 3
Student 25.250000 3.375000 8 8
Post Graduate Employee 26.236842 3.078947 38 38
Self Employeed 25.571429 2.785714 14 14
Student 23.172131 3.213115 122 122
School House wife 29.200000 5.400000 5 5
Self Employeed 27.142857 3.714286 7 7
Uneducated House wife 32.000000 3.000000 1 1
Self Employeed 30.000000 4.000000 1 1

Insights:

  • Post graduate students form the largest group, with 122 records.
  • Graduate employees form the largest employee group, with 68 records.
  • Post graduate self-employed respondents have the smallest average family size (2.8).
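
Instead of reading the largest and smallest rows off the styled table, they can be located programmatically. The sketch below uses `groupby` with named aggregation and then `idxmax` / `idxmin`; the data and the output column names `records` and `mean_family_size` are made up for the example.

```python
import pandas as pd

# Hypothetical sample; the real notebook would group df instead.
toy = pd.DataFrame({
    "Educational Qualifications": ["Graduate", "Graduate", "Post Graduate",
                                   "Post Graduate", "Post Graduate", "Ph.D"],
    "Occupation": ["Student", "Employee", "Student", "Student", "Employee", "Student"],
    "Family size": [2, 4, 3, 2, 5, 4],
})

stats = (toy.groupby(["Educational Qualifications", "Occupation"])
            .agg(records=("Family size", "size"),
                 mean_family_size=("Family size", "mean")))

# Index labels (tuples) of the group with most records and smallest mean family size.
largest_group = stats["records"].idxmax()
smallest_family = stats["mean_family_size"].idxmin()
print(largest_group, smallest_family)
```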

Univariate Analysis¶

Get minimum 4 Insights of each group.

Draw 1 Graph for each column.

In [10]:
import plotly.express as px
In [11]:
fig = px.histogram(df, x="Age")
fig.show()
In [12]:
fig = px.histogram(df, x="Gender")  # count respondents per gender
fig.show()
In [13]:
from matplotlib import pyplot as plt
import seaborn as sns
In [14]:
demog = ['Age', 'Gender', 'Marital Status',"Occupation", 'Monthly Income', "Educational Qualifications", 'Family size']
In [15]:
_, axes = plt.subplots(1, len(demog), figsize=(35, 5))
for i, feat in enumerate(demog):
    sns.countplot(x=feat, data=df, ax=axes[i])
plt.show()

Insights of Group 1 [Demographics] :

  1. The single group is about 2.5 times larger than the married group
  2. The student group is about 2 times larger than the employee group and 4 times larger than the self-employed group
  3. The No Income group is 3 times larger than the highest income group (> 50000)
  4. Family sizes 2 and 3 are the most common
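
Ratios like these can be computed rather than estimated from bar heights. The sketch below rebuilds the marital status counts from the Gender × Marital Status pivot table earlier in the notebook (female plus male counts); on the real data, `df['Marital Status'].value_counts()` should give the same Series.

```python
import pandas as pd

# Counts read off the Gender x Marital Status pivot table (Female + Male).
marital_counts = pd.Series({
    "Single": 112 + 156,
    "Married": 49 + 59,
    "Prefer not to say": 5 + 7,
})

# How many times larger the single group is than the married group.
ratio_single_to_married = marital_counts["Single"] / marital_counts["Married"]
# Share of each group in the whole sample.
shares = (marital_counts / marital_counts.sum()).round(3)

print(round(ratio_single_to_married, 2))
print(shares)
```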
In [16]:
peref= ['Medium (P1)', 'Medium (P2)', 'Meal(P1)', 'Meal(P2)', 'Perference(P1)', 'Perference(P2)']
In [17]:
_, axes = plt.subplots(1, len(peref), figsize=(35, 5))
for i, feat in enumerate(peref):
    sns.countplot(x=feat, data=df, ax=axes[i])
plt.show()

Insights of Group 2 [Preferences]:

  1. The food delivery apps group represents over 90% of the respondents in Medium P1
  2. Direct call is the largest group in Medium P2
  3. Snacks and Lunch are the major groups in Meal P1, while Dinner is the major group in Meal P2
  4. Non-veg foods is the major group in Preference P1, while Veg foods is the major group in Preference P2
In [18]:
satis = ['Ease and convenient', 'Time saving', 'More restaurant choices', 'Easy Payment option', 'More Offers and Discount', 'Good Food quality', 'Good Tracking system']
In [19]:
_, axes = plt.subplots(1, len(satis), figsize=(35, 5))
for i, feat in enumerate(satis):
    sns.countplot(x=feat, data=df, ax=axes[i])
plt.show()

Insights of Group 3 [Satisfaction] :

  1. Most respondents agree that online orders are convenient and time saving
  2. Most respondents agree that online orders allow more restaurant choices and easy payment options
  3. Most participants agree that online orders offer more discounts and offers
  4. Respondents are divided on whether online orders ensure good food quality
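
The agreement answers form an ordered scale, and the sample rows above contain both "Strongly agree" and "Strongly Agree". A possible clean-up, sketched here under the assumption that the five labels below cover every answer, is to normalise the capitalisation and convert each column to an ordered categorical. The names `LIKERT`, `likert_dtype` and `to_likert` are illustrative.

```python
import pandas as pd

# Assumed five-point scale, from lowest to highest agreement.
LIKERT = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
likert_dtype = pd.CategoricalDtype(categories=LIKERT, ordered=True)

def to_likert(series):
    """Normalise capitalisation and return an ordered categorical Series."""
    cleaned = series.str.strip().str.capitalize()
    return cleaned.astype(likert_dtype)

# Hypothetical answers, including the two spellings of "Strongly agree".
answers = pd.Series(["Agree", "Strongly Agree", "Neutral", "Strongly agree", "Disagree"])
coded = to_likert(answers)

scores = coded.cat.codes + 1             # 1 = Strongly disagree ... 5 = Strongly agree
counts = coded.value_counts(sort=False)  # counts listed in scale order
print(counts)
```

With an ordered categorical, count plots should list the bars in scale order rather than in order of first appearance, which makes the panels easier to compare.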
In [20]:
purc = ['Self Cooking', 'Health Concern', 'Late Delivery', 'Poor Hygiene', 'Bad past experience', 'Unavailability', 'Unaffordable']
In [21]:
_, axes = plt.subplots(1, len(purc), figsize=(80, 5))
for i, feat in enumerate(purc):
    sns.countplot(x=feat, data=df, ax=axes[i])
plt.show()

Insights of Group 4 [Not Purchasing] :

  1. Respondents are divided on whether self-cooking may cause not-purchasing
  2. More respondents agree than disagree that health concern and poor hygiene may cause not-purchasing
  3. Most participants agree that late delivery may cause not-purchasing
  4. More respondents disagree than agree that unavailability and unaffordability may cause not-purchasing
In [22]:
canc = ['Long delivery time', 'Delay of delivery person getting assigned', 'Delay of delivery person picking up food', 'Wrong order delivered', 'Missing item', 'Order placed by mistake']
In [23]:
_, axes = plt.subplots(1, len(canc), figsize=(80, 5))
for i, feat in enumerate(canc):
    sns.countplot(x=feat, data=df, ax=axes[i])
plt.show()

Insights of Group 5 [Cancellation] :

  1. Most participants agree that they would cancel an order in case of a long delivery time, regardless of the reason
  2. Most respondents disagree that they would cancel an order because of missing items or an order placed by mistake
In [24]:
perf = ['Influence of time', 'Order Time', 'Maximum wait time', 'Residence in busy location', 'Google Maps Accuracy', 'Good Road Condition', 'Low quantity low time', 'Delivery person ability']
In [25]:
_, axes = plt.subplots(1, len(perf), figsize=(80, 5))
for i, feat in enumerate(perf):
    sns.countplot(x=feat, data=df, ax=axes[i])
plt.show()

Insights of Group 6 [Preferences] :

  1. The maximum wait time for most respondents ranges between 30 and 45 minutes
  2. Most respondents have no specific time for ordering, which could be anytime from Monday to Sunday
  3. Most participants agree that the time of delivery has an influence on their orders
  4. More respondents agree that location accuracy on Google Maps and good road conditions affect on-time delivery
In [26]:
loyl = ['Influence of rating', 'Less Delivery time', 'High Quality of package', 'Number of calls', 'Politeness', 'Freshness ', 'Temperature', 'Good Taste ', 'Good Quantity', 'Output']
In [27]:
_, axes = plt.subplots(1, len(loyl), figsize=(80, 5))
for i, feat in enumerate(loyl):
    sns.countplot(x=feat, data=df, ax=axes[i])
plt.show()

Insights of Group 7 [Loyalty] :

  1. Most participants are influenced by the restaurant rating before ordering
  2. Most respondents find that a short delivery time is important for ordering
  3. Respondents are split between 'Important' and 'Very Important' for the factors that lead to further orders: high package quality, number of calls to make the order, politeness of the delivery person, freshness, temperature, good taste and quantity of food
  4. More respondents agree than disagree that they will purchase again

Multivariate Analysis¶

Multivariate analysis between all groups.

Insights:

  1. Gender vs Good food quality
  2. Gender vs Self Cooking
  3. Marital status vs self cooking
  4. Monthly income vs Health concern
  5. Occupation vs Poor hygiene
  6. Family size vs Bad past experience
  7. Educational qualification vs Poor hygiene
  8. Gender vs long delivery time
  9. Age vs Wrong order delivered
  10. Occupation vs Missing item
  11. Monthly income vs Order placed by mistake
  12. Family size vs influence of time
  13. Occupation vs influence of time
  14. Marital status vs Output
  15. Monthly income vs More offers & discounts
  16. Monthly income vs Good food quality
  17. Monthly income vs Unaffordability
  18. Family size vs Unavailability
  19. Gender vs Unavailability
  20. Gender vs Maximum wait time
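
The pairs above can also be kept in a list and processed in a loop. The sketch below computes, for each pair, a row-normalised cross-tabulation, which is one way to obtain the share of each answer within its parent group as a table. The helper `percent_of_parent`, the `pairs` list and the `toy` frame are made up for the example.

```python
import pandas as pd

# Two of the pairs from the list; the remaining ones would be added the same way.
pairs = [("Gender", "Good Food quality"), ("Gender", "Self Cooking")]

# Hypothetical six-row sample using the survey's column names.
toy = pd.DataFrame({
    "Gender": ["Female", "Female", "Male", "Male", "Male", "Male"],
    "Good Food quality": ["Agree", "Neutral", "Agree", "Agree", "Disagree", "Neutral"],
    "Self Cooking": ["Agree", "Agree", "Disagree", "Neutral", "Disagree", "Agree"],
})

def percent_of_parent(frame, parent, child):
    """Row-normalised crosstab: each row sums to 100 (% within the parent group)."""
    return (pd.crosstab(frame[parent], frame[child], normalize="index") * 100).round(1)

tables = {pair: percent_of_parent(toy, *pair) for pair in pairs}
for pair, table in tables.items():
    print(pair)
    print(table)
```

Because every row sums to 100, groups of very different sizes (for example 222 male and 166 female respondents) can be compared on the same footing.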
In [28]:
# 1. Gender vs Good Food quality
sns.countplot(x='Gender', hue='Good Food quality', data=df)
Out[28]:
<AxesSubplot:xlabel='Gender', ylabel='count'>
In [29]:
import plotly.express as px
In [30]:
# 1. Gender vs Good Food quality
fig = px.sunburst(data_frame=df, path=('Gender', 'Good Food quality'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Male participants agree more than female participants that online orders offer good quality food
  2. Male participants disagree more than female participants that online orders offer good quality food
  3. More female than male respondents have a neutral opinion on online food quality
In [31]:
# 2. Gender vs self cooking
fig = px.sunburst(data_frame=df, path=('Gender', 'Self Cooking'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Female participants agree more than male participants that self-cooking may cause not purchasing
  2. Male respondents, in turn, disagree more than female ones that self-cooking may cause not purchasing
In [32]:
# 3. Marital status vs self cooking
fig = px.sunburst(data_frame=df, path=('Marital Status', 'Self Cooking'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Married participants agree more than single participants that self-cooking may cause not purchasing
  2. Married respondents also disagree more than single ones that self-cooking may cause not purchasing
In [33]:
# 4. Monthly income vs Health concern
fig = px.sunburst(data_frame=df, path=('Monthly Income', 'Health Concern'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The highest income group had the highest share of respondents who disagree that health concern may cause not-purchasing
  2. The highest income group also had the lowest share of respondents who agree that health concern may cause not-purchasing
In [34]:
# 5. Occupation vs Poor Hygiene
fig = px.sunburst(data_frame=df, path=('Occupation', 'Poor Hygiene'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The student group had the highest share of agree and the lowest share of disagree that poor hygiene may cause not-purchasing
  2. The highest occupation group with disagree opinion, which agree that poor hygiene may cause not-purchasing
  3. The employee group had the highest share of neutral opinions on whether poor hygiene may cause not-purchasing
In [35]:
# 6. Family size vs Bad past experience
fig = px.sunburst(data_frame=df, path=('Family size', 'Bad past experience'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Families with 6 members (the largest size) had the highest shares of both agree and disagree that a bad past experience may cause not-purchasing, but family size 6 represents only 7% of the whole sample.
In [36]:
# 7. Educational Qualifications vs Poor Hygiene
fig = px.sunburst(data_frame=df, path=('Educational Qualifications', 'Poor Hygiene'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Around half of the Ph.D group disagrees that poor hygiene may cause not-purchasing
  2. The other groups had similar shares between agree, disagree and neutral on whether poor hygiene may cause not-purchasing
In [37]:
# 8. Gender vs Long Delivery Time
fig = px.sunburst(data_frame=df, path=('Gender', 'Long delivery time'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Males and females show similar shares of agreement and disagreement that long delivery time may cause cancellation
In [38]:
# 9. Age vs Wrong order delivered
fig = px.sunburst(data_frame=df, path=('Age', 'Wrong order delivered'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
In [39]:
# 10. Occupation vs Missing item
fig = px.sunburst(data_frame=df, path=('Occupation', 'Missing item'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The majority of the self-employed group disagrees with cancelling an order because of a missing item
  2. The student group shows similar shares of agree and disagree on cancelling an order with a missing item
In [40]:
# 11. Monthly Income vs Order Placed by mistake
fig = px.sunburst(data_frame=df, path=('Monthly Income', 'Order placed by mistake'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Among the groups with an income, 50% to 75% of participants disagree with cancelling an order placed by mistake
  2. The no income group shows similar shares of agree, disagree and neutral on cancelling an order placed by mistake
In [41]:
# 12. Family Size vs Influence of time
fig = px.sunburst(data_frame=df, path=('Family size', 'Influence of time'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The family size 4 group had the highest share agreeing that time influences ordering
  2. The remaining family size groups agree at around 75% that time influences ordering
In [42]:
# 13. Occupation vs Influence of time
fig = px.sunburst(data_frame=df, path=('Occupation', 'Influence of time'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Agreement that time influences ordering outweighs disagreement
  2. The student group had the highest share of agreement among all groups, with 81%
  3. The housewife group had the lowest share among all groups, with 44%
In [43]:
# 14. Marital Status vs Output
fig = px.sunburst(data_frame=df, path=('Marital Status', 'Output'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. Both the single and married groups would purchase online again more often than not, and the share is higher for the single group than for the married one
  2. The third group, with only 12 respondents, is split equally between purchasing and not purchasing online again
In [44]:
# 15. Monthly Income vs More Offers and Discount
fig = px.sunburst(data_frame=df, path=('Monthly Income', 'More Offers and Discount'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The income groups show similar shares across the different opinions on whether online ordering provides more offers and discounts, with agreement outweighing disagreement
In [45]:
# 16. Monthly Income vs Good Food quality
fig = px.sunburst(data_frame=df, path=('Monthly Income', 'Good Food quality'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The highest income group had the highest share agreeing that online food has good quality
  2. The lowest income group had the lowest share agreeing that online food has good quality
In [46]:
# 17. Monthly Income vs Unaffordable
fig = px.sunburst(data_frame=df, path=('Monthly Income', 'Unaffordable'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The highest income group had the highest share disagreeing that unaffordability may cause not-purchasing
  2. The lowest income group had the lowest share disagreeing that unaffordability may cause not-purchasing
In [47]:
# 18. Family size vs Unavailability
fig = px.sunburst(data_frame=df, path=('Family size', 'Unavailability'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The largest family size group had the highest share disagreeing that unavailability may cause not-purchasing
  2. The family size 5 group had the highest share agreeing that unavailability may cause not-purchasing
In [48]:
# 19. Gender vs Unavailability
fig = px.sunburst(data_frame=df, path=('Gender', 'Unavailability'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The male group had the highest share disagreeing that food unavailability may cause not-purchasing
In [49]:
# 20. Gender vs Maximum wait time
fig = px.sunburst(data_frame=df, path=('Gender', 'Maximum wait time'))
fig.update_traces(textinfo='label+percent parent')
fig.show()
  1. The female group had 45 minutes as its maximum wait time, while the male group had 30 minutes

Geographical Analysis¶

In [50]:
import pandas as pd
dd = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/earthquakes-23k.csv')

dd.head(4)
Out[50]:
Date Latitude Longitude Magnitude
0 01/02/1965 19.246 145.616 6.0
1 01/04/1965 1.863 127.352 5.8
2 01/05/1965 -20.579 -173.972 6.2
3 01/08/1965 -59.076 -23.557 5.8
In [51]:
df.head(4)
Out[51]:
Age Gender Marital Status Occupation Monthly Income Educational Qualifications Family size latitude longitude Pin code Medium (P1) Medium (P2) Meal(P1) Meal(P2) Perference(P1) Perference(P2) Ease and convenient Time saving More restaurant choices Easy Payment option More Offers and Discount Good Food quality Good Tracking system Self Cooking Health Concern Late Delivery Poor Hygiene Bad past experience Unavailability Unaffordable Long delivery time Delay of delivery person getting assigned Delay of delivery person picking up food Wrong order delivered Missing item Order placed by mistake Influence of time Order Time Maximum wait time Residence in busy location Google Maps Accuracy Good Road Condition Low quantity low time Delivery person ability Influence of rating Less Delivery time High Quality of package Number of calls Politeness Freshness Temperature Good Taste Good Quantity Output Reviews
0 20 Female Single Student No Income Post Graduate 4 12.9766 77.5993 560001 Food delivery apps Web browser Breakfast Lunch Non Veg foods (Lunch / Dinner) Bakery items (snacks) Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Agree Agree Agree Agree Agree Agree Yes Weekend (Sat & Sun) 30 minutes Agree Neutral Neutral Neutral Neutral Yes Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Yes Nil\n
1 24 Female Single Student Below Rs.10000 Graduate 3 12.9770 77.5773 560009 Food delivery apps Web browser Snacks Dinner Non Veg foods (Lunch / Dinner) Veg foods (Breakfast / Lunch / Dinner) Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Neutral Agree Strongly agree Strongly agree Agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Yes Anytime (Mon-Sun) 30 minutes Strongly Agree Neutral Disagree Strongly disagree Agree Yes Very Important Very Important Very Important Very Important Very Important Very Important Very Important Very Important Yes Nil
2 22 Male Single Student Below Rs.10000 Post Graduate 3 12.9551 77.6593 560017 Food delivery apps Direct call Lunch Snacks Non Veg foods (Lunch / Dinner) Ice cream / Cool drinks Strongly agree Strongly agree Strongly agree Neutral Neutral Disagree Neutral Disagree Neutral Neutral Agree Agree Agree Agree Agree Agree Agree Strongly agree Agree Neutral Yes Anytime (Mon-Sun) 45 minutes Agree Strongly Agree Neutral Neutral Agree Yes Important Very Important Moderately Important Very Important Very Important Important Very Important Moderately Important Yes Many a times payment gateways are an issue, so...
3 22 Female Single Student No Income Graduate 6 12.9473 77.5616 560019 Food delivery apps Walk-in Snacks Dinner Veg foods (Breakfast / Lunch / Dinner) Bakery items (snacks) Agree Agree Strongly agree Agree Strongly agree Agree Agree Agree Strongly agree Neutral Agree Disagree Disagree Neutral Agree Agree Agree Disagree Disagree Neutral Yes Anytime (Mon-Sun) 30 minutes Disagree Agree Agree Neutral Agree Yes Very Important Important Moderately Important Very Important Very Important Very Important Very Important Important Yes nil
In [52]:
import plotly.express as px
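# Centre on Bangalore (mean latitude/longitude from df.describe()) and zoom to city level;
# the "open-street-map" style needs no access token.
fig = px.density_mapbox(df, lat='latitude', lon='longitude', z='Age', radius=10,
                        center=dict(lat=12.97, lon=77.60), zoom=10,
                        mapbox_style="open-street-map")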
fig.show()

Draw Scatter Mapbox based on Maximum Wait time¶

Replace Age with Maximum wait time after converting it to numerical values!

In [53]:
df['max_wait_min'] = df['Maximum wait time'].str.replace("minutes","").str.replace("more than","")
df['max_wait_min']
Out[53]:
0      30 
1      30 
2      45 
3      30 
4      30 
      ... 
383    30 
384    45 
385    45 
386    45 
387    30 
Name: max_wait_min, Length: 388, dtype: object
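
Note that the column above still has dtype `object`: the values are strings such as `"30 "`, not numbers. A possible way to finish the conversion, and to draw the scatter map this exercise asks for, is sketched below. It is only one option: `wait_minutes` and `wait_time_map` are hypothetical helpers, the labels `"More than 60 minutes"` and `"15 minutes"` are assumed examples rather than values confirmed from the data, and `px.scatter_mapbox` is assumed to be available in the installed plotly version.

```python
import pandas as pd

def wait_minutes(series):
    """Extract the first run of digits from each label and return numbers."""
    digits = series.str.extract(r"(\d+)", expand=False)
    return pd.to_numeric(digits)

def wait_time_map(frame):
    """Scatter map coloured by waiting time; plotly is imported only when called."""
    import plotly.express as px
    return px.scatter_mapbox(frame, lat="latitude", lon="longitude",
                             color="max_wait_min", zoom=10,
                             mapbox_style="open-street-map")

# Assumed label formats; only "30 minutes" and "45 minutes" appear in the rows shown.
labels = pd.Series(["30 minutes", "45 minutes", "More than 60 minutes", "15 minutes"])
minutes = wait_minutes(labels)
print(minutes.tolist())
```

On the real data this would be used as `df['max_wait_min'] = wait_minutes(df['Maximum wait time'])`, followed by `wait_time_map(df).show()`. Extracting digits with a regular expression does not depend on the exact wording or capitalisation of the surrounding text.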
In [54]:
df.head(5)
Out[54]:
Age Gender Marital Status Occupation Monthly Income Educational Qualifications Family size latitude longitude Pin code Medium (P1) Medium (P2) Meal(P1) Meal(P2) Perference(P1) Perference(P2) Ease and convenient Time saving More restaurant choices Easy Payment option More Offers and Discount Good Food quality Good Tracking system Self Cooking Health Concern Late Delivery Poor Hygiene Bad past experience Unavailability Unaffordable Long delivery time Delay of delivery person getting assigned Delay of delivery person picking up food Wrong order delivered Missing item Order placed by mistake Influence of time Order Time Maximum wait time Residence in busy location Google Maps Accuracy Good Road Condition Low quantity low time Delivery person ability Influence of rating Less Delivery time High Quality of package Number of calls Politeness Freshness Temperature Good Taste Good Quantity Output Reviews max_wait_min
0 20 Female Single Student No Income Post Graduate 4 12.9766 77.5993 560001 Food delivery apps Web browser Breakfast Lunch Non Veg foods (Lunch / Dinner) Bakery items (snacks) Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Neutral Agree Agree Agree Agree Agree Agree Yes Weekend (Sat & Sun) 30 minutes Agree Neutral Neutral Neutral Neutral Yes Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Moderately Important Yes Nil\n 30
1 24 Female Single Student Below Rs.10000 Graduate 3 12.9770 77.5773 560009 Food delivery apps Web browser Snacks Dinner Non Veg foods (Lunch / Dinner) Veg foods (Breakfast / Lunch / Dinner) Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Neutral Agree Strongly agree Strongly agree Agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Strongly agree Yes Anytime (Mon-Sun) 30 minutes Strongly Agree Neutral Disagree Strongly disagree Agree Yes Very Important Very Important Very Important Very Important Very Important Very Important Very Important Very Important Yes Nil 30
2 22 Male Single Student Below Rs.10000 Post Graduate 3 12.9551 77.6593 560017 Food delivery apps Direct call Lunch Snacks Non Veg foods (Lunch / Dinner) Ice cream / Cool drinks Strongly agree Strongly agree Strongly agree Neutral Neutral Disagree Neutral Disagree Neutral Neutral Agree Agree Agree Agree Agree Agree Agree Strongly agree Agree Neutral Yes Anytime (Mon-Sun) 45 minutes Agree Strongly Agree Neutral Neutral Agree Yes Important Very Important Moderately Important Very Important Very Important Important Very Important Moderately Important Yes Many a times payment gateways are an issue, so... 45
3 22 Female Single Student No Income Graduate 6 12.9473 77.5616 560019 Food delivery apps Walk-in Snacks Dinner Veg foods (Breakfast / Lunch / Dinner) Bakery items (snacks) Agree Agree Strongly agree Agree Strongly agree Agree Agree Agree Strongly agree Neutral Agree Disagree Disagree Neutral Agree Agree Agree Disagree Disagree Neutral Yes Anytime (Mon-Sun) 30 minutes Disagree Agree Agree Neutral Agree Yes Very Important Important Moderately Important Very Important Very Important Very Important Very Important Important Yes nil 30
4 22 Male Single Student Below Rs.10000 Post Graduate 4 12.9850 77.5533 560010 Walk-in Direct call Lunch Dinner Non Veg foods (Lunch / Dinner) Veg foods (Breakfast / Lunch / Dinner) Agree Agree Agree Agree Agree Neutral Neutral Agree Strongly agree Strongly agree Agree Strongly agree Agree Disagree Strongly agree Strongly agree Neutral Neutral Neutral Disagree Yes Weekend (Sat & Sun) 30 minutes Agree Agree Agree Agree Agree Yes Important Important Moderately Important Important Important Important Very Important Very Important Yes NIL 30
In [55]:
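import plotly.express as px
# z is expected to be numeric, so convert the cleaned strings (e.g. "30 ") to numbers first
df['max_wait_min'] = pd.to_numeric(df['max_wait_min'].astype(str).str.strip(), errors='coerce')
fig = px.density_mapbox(df, lat='latitude', lon='longitude', z='max_wait_min', radius=10,
                        center=dict(lat=0, lon=180), zoom=0,
                        mapbox_style="stamen-terrain")
fig.show()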
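
The map above is centered at lat=0, lon=180 with zoom=0, which shows the whole world, while the survey points sit in one city. A small sketch of deriving a center from the data instead, using only the five coordinates visible in the `df.head()` output (the `sample` frame is a stand-in for `df`):

```python
import pandas as pd

# Coordinates of the five rows shown in the df.head() output above.
sample = pd.DataFrame({
    "latitude": [12.9766, 12.9770, 12.9551, 12.9473, 12.9850],
    "longitude": [77.5993, 77.5773, 77.6593, 77.5616, 77.5533],
})

# Use the mean position as the map center instead of a fixed lat=0, lon=180.
center = dict(lat=sample["latitude"].mean(), lon=sample["longitude"].mean())
print(center)
```

The resulting dict could be passed as `center=center` to `px.density_mapbox`, together with a larger `zoom`; the right zoom level is a judgment call.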

Build A Data Story¶

  1. Convert your insights and graphs into a slide show.
  2. Run this command in your command prompt: jupyter nbconvert Jupyter\ 6_Food_Delivery_Analysis.ipynb --to slides --post serve
In [ ]:
#----------------------------# Load Your Dependencies #--------------------------#
import dash
from jupyter_dash import JupyterDash  # JupyterDash lets you run Dash apps from JupyterLab
from dash import dcc    # Dash Core Components
from dash import html   # HTML components for layout and fonts
import plotly.express as px           # Plotly Express, which uses graph objects internally
from plotly.subplots import make_subplots
import plotly.graph_objects as go     # Plotly graph objects for more customized figures
import numpy as np                    # NumPy, needed for np.median in the pivot table
import pandas as pd                   # Pandas for data wrangling
from dash import Input, Output, dash_table  # Input, Output for callback functions

#--------------------------# Instantiate Your App #--------------------------#

app = JupyterDash(__name__)

#--------------------------# Pandas Section #------------------------------#

df =pd.read_csv('onlinedeliverydata.csv')
group = ['Consumer demographics', 'Meal preferences', 'Satisfaction', 'Not purchasing concerns', 'Cancellation concerns', 'Preferences', 'Loyalty']
demog = ['Age', 'Gender', 'Marital Status',"Occupation", 'Monthly Income', "Educational Qualifications", 'Family size']
peref= ['Medium (P1)', 'Medium (P2)', 'Meal(P1)', 'Meal(P2)', 'Perference(P1)', 'Perference(P2)']
satis = ['Ease and convenient', 'Time saving', 'More restaurant choices', 'Easy Payment option', 'More Offers and Discount', 'Good Food quality', 'Good Tracking system']
purc = ['Self Cooking', 'Health Concern', 'Late Delivery', 'Poor Hygiene', 'Bad past experience', 'Unavailability', 'Unaffordable']
canc = ['Long delivery time', 'Delay of delivery person getting assigned', 'Delay of delivery person picking up food', 'Wrong order delivered', 'Missing item', 'Order placed by mistake']
perf = ['Influence of time', 'Order Time', 'Maximum wait time', 'Residence in busy location', 'Google Maps Accuracy', 'Good Road Condition', 'Low quantity low time', 'Delivery person ability']
loyl = ['Influence of rating', 'Less Delivery time', 'High Quality of package', 'Number of calls', 'Politeness', 'Freshness ', 'Temperature', 'Good Taste ', 'Good Quantity', 'Output']
all_g = [demog, peref, satis, purc, canc, perf, loyl]
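# Median and count of Age and Family size per education level and occupation
df_piv=pd.pivot_table(df,index=["Educational Qualifications","Occupation"], values=['Age','Family size'],
                               aggfunc=[np.median,len])
#df_piv.style.bar()
df_piv.reset_index().to_dict('records')  # 'records' is the full orient name; the short name 'rows' is deprecated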
#--------------------------------------------------------------------------#
    
# The app instantiated above is reused here; the layout is attached to it below.

app.layout = html.Div([html.Div([html.A([html.H2('Food Delivery Analysis Dashboard'),html.Img(src='/assets/logo1.png')],  # A for hyper links
                                        href='http://projectnitrous.com/')],className="banner"),
                       # First row
                       html.Div([
                           html.H4('Univariate Analysis for the data groups', style={'color':'#ffffff'}),
                       ], className="eleven columns", style={'padding':10}),
                       # Group dropdown and histograms
                       html.Div([
                           dcc.Dropdown(
                                id='dropdown_grp',
                                options=[{'label':group[i], 'value':i} for i in range(len(group))],
                                value=0,
                                multi=False,
                                searchable=True,
                                clearable=False
                           ),
                           html.Div(
                               dcc.Graph(id='hist_grp'),
                           ),
                       ], className="eleven columns", style={'backgroundColor': '#2a2b4a', 'padding':10}),
                       #, 'border-radius': 25
                       
                       html.Div([html.Br(),],className="eleven columns"),
                       # Second row
                       html.Div([
                           html.H4('Multivariate Analysis', style={'color':'#ffffff'}),
                       ], className="five columns", style={'padding':10}),
                       html.Div([
                           html.H4('Multivariate Analysis', style={'color':'#ffffff'}),
                       ], className="five columns", style={'padding':10}),
                       # Multivariate Analysis
                       html.Div([
                           dcc.Dropdown(
                                id='dropdown_var1',
                                options=[{'label':i, 'value':i} for i in df.iloc[:,:7]],
                                value='Age',
                                multi=False,
                                searchable=True,
                                clearable=False
                           ),
                           dcc.Dropdown(
                                id='dropdown_var2',
                                options=[{'label':i, 'value':i} for i in df.iloc[:,10:-2]],
                                value='Medium (P1)',
                                multi=False,
                                searchable=True,
                                clearable=False
                           ),
                           html.Div(
                               dcc.Graph(id='sun_multi1'),
                           ),
                       ], className="five columns", style={'backgroundColor': '#2a2b4a', 'padding':10}),
                       # Multivariate Analysis
                       html.Div([
                           dcc.Dropdown(
                                id='dropdown_var3',
                                options=[{'label':i, 'value':i} for i in df.columns[:7]],  # first 7 columns; df[:7] slices rows and would list every column
                                value='Age',
                                multi=False,
                                searchable=True,
                                clearable=False
                           ),
                           dcc.Dropdown(
                                id='dropdown_var4',
                                options=[{'label':i, 'value':i} for i in df.columns[10:-2]],  # same range as dropdown_var2, so the default 'Medium (P1)' is an option
                                value='Medium (P1)',
                                multi=False,
                                searchable=True,
                                clearable=False
                           ),
                           html.Div(
                               dcc.Graph(id='sun_multi2'),
                           ),
                       ], className="five columns", style={'backgroundColor': '#2a2b4a', 'padding':10}),
                       
                       html.Div([html.Br(),],className="eleven columns"),
#                        # Third row
#                        html.Div([
#                            html.H4('Total quantity/store type & products', style={'color':'#003595'}),
#                        ], className="three columns", style={'padding':10}),
#                        html.Div([
#                            html.H4('Total quantity/store type & gender', style={'color':'#003595'}),
#                        ], className="three columns", style={'padding':10}),
#                        html.Div([
#                            html.H4('Total quantity/store type & age', style={'color':'#003595'}),
#                        ], className="three columns", style={'padding':10}),
#                        # Store type vs Products and Sub-Products
#                        html.Div([
#                            dcc.Dropdown(
#                                 id='dropdown_prod',
#                                 options=[
#                                     {'label': 'Store type', 'value': 'Store_type'},
#                                     {'label': 'Product', 'value': 'prod_cat'},
#                                     {'label': 'Sub Product', 'value': 'prod_subcat'},
#                                 ],

#                                 value=['Store_type'],
#                                 multi=True,
#                                 searchable=True,
#                                 clearable=False
#                            ),
#                            html.Div(
#                                dcc.Graph(id='sun_prod'),
#                            ),
#                        ], className="three columns", style={'backgroundColor': '#efedfa', 'border-radius': 25, 'padding':10}),
#                        # Store type vs Gender
#                        html.Div([
#                            dcc.Dropdown(
#                                 id='dropdown_gen',
#                                 options=[
#                                     {'label': 'Store type', 'value': 'Store_type'},
#                                     {'label': 'Gender', 'value': 'Gender'},
#                                 ],

#                                 value=['Store_type'],
#                                 multi=True,
#                                 searchable=True,
#                                 clearable=False
#                            ),
#                            html.Div(
#                                dcc.Graph(id='sun_gen'),
#                            ),
#                        ], className="three columns", style={'backgroundColor': '#efedfa', 'border-radius': 25, 'padding':10}),
#                        # Store type vs Age
#                        html.Div([
#                            dcc.Dropdown(
#                                 id='dropdown_age',
#                                 options=[
#                                     {'label': 'Store type', 'value': 'Store_type'},
#                                     {'label': 'Age category', 'value': 'age_cat'},
#                                 ],

#                                 value=['Store_type'],
#                                 multi=True,
#                                 searchable=True,
#                                 clearable=False
#                            ),
#                            html.Div(
#                                dcc.Graph(id='sun_age'),
#                            ),
#                        ], className="three columns", style={'backgroundColor': '#efedfa', 'border-radius': 25, 'padding':10}),
                       
#                        html.Div([html.Br(),],className="eleven columns"),
#                        # Fourth row
#                        html.Div([
#                            html.H4('Customers count per city', style={'color':'#003595'}),
#                        ], className="five columns", style={'padding':10}),
#                        html.Div([
#                            html.H4('Correlations between different fields affecting the clients', style={'color':'#003595'}),
#                        ], className="five columns", style={'padding':10}),
#                        # City & No. of Customers
#                        html.Div([
#                                dcc.Graph(id='hist_city'),
#                        ], className="five columns", style={'backgroundColor': '#efedfa', 'border-radius': 25, 'padding':10}),
#                        # Correlation Heatmap
#                        html.Div([
#                            dcc.Dropdown(
#                                 id='dropdown_cor',
#                                 options=[{'label':df.columns[i], 'value':df.columns[i]} for i in range(len(df.columns)) if (df[df.columns[i]].dtype== 'int64') or (df[df.columns[i]].dtype== 'float64')],

#                                 value=['Qty', 'Rate'],
#                                 multi=True,
#                                 searchable=True,
#                                 clearable=False
#                            ),
#                            html.Div(
#                                dcc.Graph(id='heatmap_cor'),
#                            ),
#                        ], className="five columns", style={'backgroundColor': '#efedfa', 'border-radius': 25, 'padding':10}),
                       
#                        html.Div([html.Br(),],className="eleven columns"),
#                        # Fifth row
#                        # Sentences
                       html.Div([
                           html.H4('The most and least values', style={'color':'#ffffff'}),
                       ], className="eleven columns", style={'padding':10}),
                       html.Div([
                           # dcc.RadioItems(
                           #     id='radio_value',
                           #      options=[
                           #          {'label': 'Highest value', 'value': 'most'},
                           #          {'label': 'Lowest value', 'value': 'least'}
                           #      ],
                           #      value='most',
                           #      labelStyle={'display': 'inline-block'}
                           # ),
                           # html.H4(html.Div(id='sent1')),
                           html.Div(
                               #dash_table.DataTable(df_piv.to_dict('records'), [{"name": i, "id": i} for i in df_piv.columns]),
                           ),
                       ], className="eleven columns", style={'backgroundColor': '#2a2b4a', 'padding':10}),
               ], className="twelve columns", style={'backgroundColor': '#1b203d', 'width':'100%', 'height':'100%', 'top':'0px', 'left':'0px'})

@app.callback(
    Output('hist_grp', 'figure'),
    Input('dropdown_grp', 'value'),
    )
def grp_hist(value):
    # Columns of the selected group; one histogram per column, side by side
    cols = all_g[int(value)]
    fig = make_subplots(rows=1, cols=len(cols))
    for pos, col_name in enumerate(cols, start=1):
        # Take the single trace from a Plotly Express histogram and place it in its subplot
        fig.add_trace(px.histogram(df, x=col_name, text_auto=True).data[0], row=1, col=pos)
    fig.update_layout({'paper_bgcolor': 'rgba(0, 0, 0, 0)'})
    return fig

@app.callback(
    Output('sun_multi1', 'figure'),
    Input('dropdown_var1', 'value'),
    Input('dropdown_var2', 'value'),
    )
def multi1_sun(value1, value2):
    fig = px.sunburst(data_frame=df, path=(value1, value2))
    fig.update_traces(textinfo='label+percent parent')
    fig.update_layout({'paper_bgcolor': 'rgba(0, 0, 0, 0)'})
    return fig

@app.callback(
    Output('sun_multi2', 'figure'),
    Input('dropdown_var3', 'value'),
    Input('dropdown_var4', 'value'),
    )
def multi2_sun(value1, value2):
    fig = px.sunburst(data_frame=df, path=(value1, value2))
    fig.update_traces(textinfo='label+percent parent')
    fig.update_layout({'paper_bgcolor': 'rgba(0, 0, 0, 0)'})
    return fig

# @app.callback(
#     Output('pv_table', 'children'),
#     Input('pv_table', 'id'),
#     )
# def table_pv(value):
#     df_piv=pd.pivot_table(df,index=["Educational Qualifications","Occupation"], values=['Age','Family size'],
#                                aggfunc=[np.median,len])
#     df_piv.style.bar()
#     dash_piv = dash_table.DataTable(df_piv.to_dict('records'), [{"name": i, "id": i} for i in df_piv.columns])
#     return dash_piv

app.run_server(mode='external')
C:\Users\Monmon\AppData\Local\Temp/ipykernel_11352/2200546115.py:31: FutureWarning:

Using short name for 'orient' is deprecated. Only the options: ('dict', list, 'series', 'split', 'records', 'index') will be used in a future version. Use one of the above to silence this warning.
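
The warning above concerns short names for the `orient` argument of `DataFrame.to_dict`; `'records'` is the full spelling that returns one dict per row. A small sketch on a hypothetical toy frame (not the survey data); using the string `'median'` instead of `np.median` is a choice made here to keep the example short:

```python
import pandas as pd

# Hypothetical toy data that borrows three column names from the survey.
toy = pd.DataFrame({
    "Occupation": ["Student", "Student", "Employee"],
    "Age": [20, 24, 30],
    "Family size": [4, 2, 3],
})

# Median Age and Family size per Occupation.
piv = pd.pivot_table(toy, index="Occupation", values=["Age", "Family size"],
                     aggfunc="median")

# orient='records' gives a list with one {column: value} dict per row.
records = piv.reset_index().to_dict("records")
print(records)
```

A list in this shape is what `dash_table.DataTable` takes as its `data` argument.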

Build A Dashboard¶

  1. Design your own dashboard.
  2. Follow these steps to deploy it on Heroku: https://dash.plotly.com/deployment
In [ ]: